'Unethical' AI research on Reddit under fire

Science

A study that used artificial intelligence–generated content to "participate" in online discussions and test whether AI was more successful at changing people's minds than human-generated content has caused an uproar because of ethical concerns about the work. This week some of the unwitting research participants publicly asked the University of Zürich (UZH), where the researchers behind the experiment hold positions, to investigate and apologize. "I think people have a reasonable expectation to not be in scientific experiments without their consent," says Casey Fiesler, an expert on internet research ethics at the University of Colorado Boulder. A university statement emailed to Science says the researchers--who remain anonymous--have decided not to publish their results. The university will investigate the incident, the statement says.


IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification

arXiv.org Artificial Intelligence

Sea ice plays a critical role in the global climate system and maritime operations, making timely and accurate classification essential. However, traditional manual methods are time-consuming, costly, and have inherent biases. Automating sea ice type classification addresses these challenges by enabling faster, more consistent, and scalable analysis. While both traditional and deep learning approaches have been explored, deep learning models offer a promising direction for improving efficiency and consistency in sea ice classification. However, the absence of a standardized benchmark and comparative study prevents a clear consensus on the best-performing models. To bridge this gap, we introduce IceBench, a comprehensive benchmarking framework for sea ice type classification. Our key contributions are threefold: First, we establish the IceBench benchmarking framework, which leverages the existing AI4Arctic Sea Ice Challenge dataset as a standardized dataset, incorporates a comprehensive set of evaluation metrics, and includes representative models from the entire spectrum of sea ice type classification methods, categorized into two distinct groups: pixel-based classification methods and patch-based classification methods. IceBench is open-source and allows for convenient integration and evaluation of other sea ice type classification methods, thereby facilitating comparative evaluation of new methods and improving reproducibility in the field. Second, we conduct an in-depth comparative study on representative models to assess their strengths and limitations, providing insights for both practitioners and researchers. Third, we leverage IceBench for systematic experiments addressing key research questions on model transferability across seasons (time) and locations (space), data downscaling, and preprocessing strategies.


Data-Driven Probabilistic Air-Sea Flux Parameterization

arXiv.org Machine Learning

Accurately quantifying air-sea fluxes is important for understanding air-sea interactions and improving coupled weather and climate prediction systems. This study introduces a probabilistic framework to represent the highly variable nature of air-sea fluxes, which is missing in deterministic bulk algorithms. Assuming Gaussian distributions conditioned on the input variables, we use artificial neural networks and eddy-covariance measurement data to estimate the mean and variance by minimizing a negative log-likelihood loss. The trained neural networks provide alternative mean flux estimates to existing bulk algorithms and quantify the uncertainty around those mean estimates. A stochastic parameterization of air-sea turbulent fluxes can be constructed by sampling from the predicted distributions. Tests in a single-column forced upper-ocean model suggest that changes in flux algorithms influence sea surface temperature and mixed layer depth seasonally. The ensemble spread in stochastic runs is most pronounced during spring restratification.
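A minimal sketch of the heteroscedastic-Gaussian idea described in this abstract, written against an assumed PyTorch setup: a small network predicts the conditional mean and log-variance of a flux given bulk input variables, is trained by minimizing the Gaussian negative log-likelihood, and stochastic flux samples are drawn from the predicted distribution. The architecture, input variables, and names (ProbabilisticFluxNet, gaussian_nll) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ProbabilisticFluxNet(nn.Module):
    def __init__(self, n_inputs: int = 4, n_hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(n_inputs, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
        )
        self.mean_head = nn.Linear(n_hidden, 1)     # conditional mean flux
        self.logvar_head = nn.Linear(n_hidden, 1)   # log-variance, kept positive via exp()

    def forward(self, x):
        h = self.body(x)
        return self.mean_head(h), self.logvar_head(h)

def gaussian_nll(mean, logvar, target):
    # Negative log-likelihood of target under N(mean, exp(logvar)), up to a constant.
    return 0.5 * (logvar + (target - mean) ** 2 / logvar.exp()).mean()

# Toy training step on random stand-in data (e.g., wind speed, SST,
# air temperature, humidity -> one turbulent flux component).
x = torch.randn(256, 4)
y = torch.randn(256, 1)
model = ProbabilisticFluxNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

opt.zero_grad()
mean, logvar = model(x)
loss = gaussian_nll(mean, logvar, y)
loss.backward()
opt.step()

# One stochastic flux sample, as used for stochastic parameterization.
with torch.no_grad():
    mean, logvar = model(x)
    flux_sample = mean + (0.5 * logvar).exp() * torch.randn_like(mean)
```

Predicting the log-variance rather than the variance keeps the learned spread positive without extra constraints, which is a common choice for this kind of probabilistic regression.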


Empowering the Future Workforce: Prioritizing Education for the AI-Accelerated Job Market

arXiv.org Artificial Intelligence

Lisa Amini (IBM Research), Henry F. Korth (Lehigh University), Nita Patel (Otis), Evan Peck (University of Colorado Boulder), Ben Zorn (Microsoft)

It is believed by some that we are entering a new age of technology, characterized by advanced, pervasive Artificial Intelligence (AI), during which the rate of workforce and economic disruption will be substantially greater than in previous periods. Regardless of whether a new era has commenced, AI is increasing in capability, speeding integration into the workplace and our homes, and prevailing in both technical and non-technical contexts and occupations. New skills and professions -- many of which are not yet conceived -- will arise, as will widespread job displacement. Just as the Information Age required national imperatives for computing education, similar imperatives exist for the rise of AI. In a survey of 4702 CEOs, 70 percent say AI will significantly change the way their companies create, deliver, and capture value over the next three years, and 45 percent believe their companies will no longer be viable in ten years if they continue on their current path.


Building Machine Learning Challenges for Anomaly Detection in Science

arXiv.org Artificial Intelligence

Scientific discoveries are often made by finding a pattern or object that was not predicted by the known rules of science. Oftentimes, these anomalous events or objects that do not conform to the norms are an indication that the rules of science governing the data are incomplete, and something new needs to be present to explain these unexpected outliers. The challenge of finding anomalies can be confounding, since it requires codifying a complete knowledge of the known scientific behaviors and then projecting these known behaviors onto the data to look for deviations. For machine learning, this presents a particular challenge, since we require that the model not only understand scientific data perfectly but also recognize when the data is inconsistent with and outside the scope of its trained behavior. In this paper, we present three datasets aimed at developing machine learning-based anomaly detection for disparate scientific domains covering astrophysics, genomics, and polar science. We present the different datasets along with a scheme to make machine learning challenges around the three datasets findable, accessible, interoperable, and reusable (FAIR). Furthermore, we present an approach that generalizes to future machine learning challenges, enabling the possibility of larger, more compute-intensive challenges that can ultimately lead to scientific discovery.
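As a generic illustration of the setting the abstract describes (a model fit only on "normal" behavior that must flag points outside its trained scope), here is a hedged sketch using an off-the-shelf detector on synthetic data; it is not one of the paper's challenge datasets or baselines.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(1000, 5))   # in-distribution "known behavior"
outliers = rng.normal(loc=6.0, scale=1.0, size=(10, 5))    # unexpected events

# Fit only on normal data, then score a mix of normal and anomalous points.
detector = IsolationForest(random_state=0).fit(normal)
scores = detector.score_samples(np.vstack([normal[:5], outliers[:5]]))

# Lower scores indicate points the model considers more anomalous,
# i.e., further from the behavior it was trained to expect.
print(scores)
```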


Mind the Gap: Bridging the Divide Between AI Aspirations and the Reality of Autonomous Characterization

arXiv.org Artificial Intelligence

What does materials science look like in the "Age of Artificial Intelligence"? Each materials domain (synthesis, characterization, and modeling) has a different answer to this question, motivated by unique challenges and constraints. This work focuses on the tremendous potential of autonomous characterization within electron microscopy. We present our recent advancements in developing domain-aware, multimodal models for microscopy analysis capable of describing complex atomic systems. We then address the critical gap between the theoretical promise of autonomous microscopy and its current practical limitations, showcasing recent successes while highlighting the necessary developments to achieve robust, real-world autonomy.


Forecasting Local Ionospheric Parameters Using Transformers

arXiv.org Artificial Intelligence

Accurate and efficient modeling of Earth's ionosphere has a significant impact on research and operational communities due to its effects on radio communications, radar performance [1, 2, 3], and satellite drag [4]. Success in forecasting key parameters such as the F2 layer critical frequency (foF2) and height (hmF2) and the total electron content (TEC) allows one to anticipate and mitigate the impacts of ionospheric variability on such systems. Over the past decades, many modeling approaches have been developed to predict these ionospheric parameters with increasing accuracy and skill. These models may be broadly categorized as empirical, physics-based, and, more recently, machine learning methods. Empirical models often rely on extensive historical datasets to establish statistical relationships between ionospheric parameters and geophysical variables. The International Reference Ionosphere (IRI) model [5] is a widely used standard that provides monthly averages of various ionospheric parameters based on many decades of past observations. IRI has seen continual development and improvement over the years, adding a host of submodels that capture specific aspects of the ionosphere, such as the CCIR [6, 7] and URSI [8] foF2 models for representing the diurnal variations of the peak plasma density across the globe, the AMTB [9] and SHU-2015 [10] models for even more harmonic expansions of hmF2, and NeQuick 2 [11] for improved topside electron density accuracy and thus better estimates of TEC [12, 13]. So, while large empirical models like IRI continue to improve, the number of available options needed to address each domain and source of variance in the ionosphere also grows, and choosing the appropriate settings may be prohibitive without expert knowledge of each submodel.
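To make the forecasting setup named in the title concrete, here is a minimal sketch of a transformer-style sequence model that maps a window of past foF2 values to a one-step-ahead prediction. The layer sizes, window length, scalar input, and the omission of a positional encoding are simplifying assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class IonoTransformer(nn.Module):
    def __init__(self, d_model: int = 32, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                 # scalar foF2 sample -> model dimension
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)                  # one-step-ahead foF2 forecast

    def forward(self, x):                                  # x: (batch, seq_len, 1)
        # A real model would add a positional encoding before the encoder.
        h = self.encoder(self.embed(x))
        return self.head(h[:, -1])                         # predict from the last token's state

model = IonoTransformer()
past_foF2 = torch.randn(8, 48, 1)                          # 48 past samples per batch element
next_foF2 = model(past_foF2)                               # shape (8, 1)
```

The same skeleton extends to multivariate inputs (e.g., hmF2, TEC, solar and geomagnetic indices) by widening the input projection.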


How Users Who are Blind or Low Vision Play Mobile Games: Perceptions, Challenges, and Strategies

arXiv.org Artificial Intelligence

As blind and low-vision (BLV) players engage more deeply with games, accessibility features have become essential. While some research has explored tools and strategies to enhance game accessibility, the specific experiences of these players with mobile games remain underexamined. This study addresses this gap by investigating how BLV users experience mobile games with varying accessibility levels. Through interviews with 32 experienced BLV mobile players, we explore their perceptions, challenges, and strategies for engaging with mobile games. Our findings reveal that BLV players turn to mobile games to alleviate boredom, achieve a sense of accomplishment, and build social connections, but face barriers depending on the game's accessibility level. We also compare mobile games to other forms of gaming, highlighting the relative advantages of mobile games, such as the inherent accessibility of smartphones. This study contributes to understanding BLV mobile gaming experiences and provides insights for enhancing accessible mobile game design.


Exploring Exploration in Bayesian Optimization

arXiv.org Artificial Intelligence

A well-balanced exploration-exploitation trade-off is crucial for successful acquisition functions in Bayesian optimization. However, there is a lack of quantitative measures for exploration, making it difficult to analyze and compare different acquisition functions. This work introduces two novel approaches - observation traveling salesman distance and observation entropy - to quantify the exploration characteristics of acquisition functions based on their selected observations. Using these measures, we examine the explorative nature of several well-known acquisition functions across a diverse set of black-box problems, uncover links between exploration and empirical performance, and reveal new relationships among existing acquisition functions. Beyond enabling a deeper understanding of acquisition functions, these measures also provide a foundation for guiding their design in a more principled and systematic manner.
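A rough sketch of how the two proposed measures could be computed from the set of points an acquisition function has already queried. A greedy nearest-neighbour tour stands in for the traveling salesman distance and a histogram estimate for the observation entropy, so the exact definitions in the paper may differ; function names and constants are illustrative.

```python
import numpy as np

def nn_tour_length(points: np.ndarray) -> float:
    """Greedy nearest-neighbour approximation of the shortest tour through
    all queried points; longer tours suggest more exploratory behavior."""
    remaining = list(range(1, len(points)))
    current, total = 0, 0.0
    while remaining:
        dists = [np.linalg.norm(points[current] - points[j]) for j in remaining]
        k = int(np.argmin(dists))
        total += dists[k]
        current = remaining.pop(k)
    return total

def observation_entropy(points: np.ndarray, bins: int = 10) -> float:
    """Shannon entropy of a histogram of the queried points; higher values
    mean the observations are spread more evenly over the domain."""
    hist, _ = np.histogramdd(points, bins=bins)
    p = hist.ravel() / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

pts = np.random.rand(30, 2)   # e.g., 30 queries on a 2-D black-box problem
print(nn_tour_length(pts), observation_entropy(pts))
```

Comparing these values across acquisition functions on the same problem is the kind of analysis the abstract describes: a purely exploitative strategy yields short tours and low entropy, a space-filling one the opposite.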


WENDy for Nonlinear-in-Parameter ODEs

arXiv.org Machine Learning

The Weak-form Estimation of Non-linear Dynamics (WENDy) algorithm is extended to accommodate systems of ordinary differential equations that are nonlinear-in-parameters (NiP). The extension rests on derived analytic expressions for a likelihood function, its gradient and its Hessian matrix. WENDy makes use of these to approximate a maximum likelihood estimator based on optimization routines suited for non-convex optimization problems. The resulting parameter estimation algorithm has better accuracy, a substantially larger domain of convergence, and is often orders of magnitude faster than the conventional output error least squares method (based on forward solvers). The WENDy.jl algorithm is efficiently implemented in Julia. We demonstrate the algorithm's ability to accommodate the weak form optimization for both additive normal and multiplicative log-normal noise, and present results on a suite of benchmark systems of ordinary differential equations. In order to demonstrate the practical benefits of our approach, we present extensive comparisons between our method and output error methods in terms of accuracy, precision, bias, and coverage.
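A heavily reduced sketch of the weak-form idea for a single ODE du/dt = f(u; theta): integrating the equation against smooth, localized test functions moves the time derivative off the noisy data, so parameters can be estimated without a forward solver. This toy version minimizes squared weak residuals only; the actual WENDy likelihood, its gradient, and its Hessian are richer, and the logistic-growth example, test functions, and noise level here are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

t = np.linspace(0.0, 10.0, 201)
dt = t[1] - t[0]
u_true = 2.0 / (1.0 + np.exp(-0.8 * (t - 5.0)))      # logistic trajectory with (r, K) = (0.8, 2.0)
u_obs = u_true + 0.01 * np.random.randn(t.size)       # noisy observations

def f(u, theta):
    # Logistic growth right-hand side; nonlinear in the parameters (r, K).
    r, K = theta
    return r * u * (1.0 - u / K)

# Smooth, localized test functions centred in the interior of the time domain,
# so boundary terms from integration by parts are negligible.
centres = np.linspace(2.0, 8.0, 13)
width = 1.0
phi = np.exp(-((t[None, :] - centres[:, None]) / width) ** 2)
dphi = -2.0 * (t[None, :] - centres[:, None]) / width**2 * phi

def weak_residuals(theta):
    # Integration by parts: integral(phi' * u) + integral(phi * f(u; theta)) ~ 0
    # for each test function, using a simple Riemann sum for the integrals.
    return (dphi * u_obs).sum(axis=1) * dt + (phi * f(u_obs, theta)).sum(axis=1) * dt

fit = minimize(lambda th: np.sum(weak_residuals(th) ** 2), x0=np.array([0.5, 1.5]))
print(fit.x)   # estimates of (r, K); the data above were generated with (0.8, 2.0)
```

Because the residual depends on the data only through quadratures, no repeated forward solves are needed, which is the source of the speed advantage over output error least squares that the abstract reports.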